SUMMARY PAGE


THE PROBLEM

To evaluate the effects upon speech produced in a helium-rich atmosphere
when the talker’s auditory feedback is masked by loud noise.

FINDINGS

Mean intelligibility scores improved significantly, by about 10
percentage points, for speech in both air and helium when loud noise
interfered with the talker’s ability to hear what he was saying. However,
the alterations made by the talker to improve his speech intelligibility
when speaking in loud noise apparently were not related to either the
acoustic character of vowels as revealed by spectrograms or the long-term
spectra of the voice.

APPLICATION

The information contained in this report is useful for the design of
communication systems which will improve the production of speech by
divers during deep submergence.
ABSTRACT


Acoustic and intelligibility analyses were made of speech from five
talkers breathing air or a mixture of helium and oxygen, when their
auditory feedback was or was not masked by loud noise of 95 decibels
sound pressure level re 0.0002 microbar. Mean intelligibility scores, as
determined from responses by 26 listeners, improved significantly, by
about 10 percentage points, for both air and helium conditions when noise
interfered with a talker’s ability to hear his own speech. The average
long-term power spectra of speech in air and speech in the helium mix did
not differ to an appreciable degree, as had been expected. However, sound
spectrograms for the helium-speech revealed upward frequency shifts, as
typically reported. Neither the average spectra nor the spectrograms of
helium-speech and speech in air showed significant differences between
talking in noise and talking in quiet. We conclude that alterations made
to improve intelligibility while speaking in loud noise are not closely
related to the acoustic variations analyzed in this investigation.
AUDITORY FEEDBACK AND HELIUM-SPEECH


I. INTRODUCTION

When deep sea divers are at work underwater, or in simulated underwater
situations such as pressure chambers, they have difficulty communicating
by voice with surface personnel and with other divers because of the
unusual sound of their speech in these situations. There are two main
reasons for the unusual-sounding speech of deep sea divers. First, the
increased pressure interferes with vocal production, causing a nasal and
crisp quality. Second, divers typically breathe a helium-rich mixture at
great depths, and this distorts the resonant characteristics of the vocal
cavities, producing a “Donald Duck” quality of speech [1].

Fairbanks [2] theorized that the speaking system is a bioacoustic
servosystem which monitors and controls its output by auditory means.
Any interference with auditory feedback, or the ability to hear oneself
talking, would be reflected in one’s speech. It follows that secondary
effects upon divers’ speech may result from the high ambient noise levels
and poor hearing ability of men in pressure chambers or underwater.

II. PURPOSE

The purpose of the present study was to investigate the effects of helium
and masked auditory feedback upon speech in terms of the average power
spectrum, the sound spectrogram, and intelligibility. The power spectrum
indicates the average acoustical power of speech in one-cycle (1 Hz)
bandwidths along a frequency continuum over an extended period of time.
The spectrogram shows the frequencies and intensities of formants, which
are concentrations of acoustical energy in the speech signal, as a
function of time.

The following predictions were proposed:

a. Comparison of sound spectrograms of speech at normal atmospheric
pressure in air and in an 80-20 percent mixture of helium and oxygen
reveals that formant frequencies for the helium condition are about one
and one-half times those for air. It was proposed that this shift would
be reflected in the long-term average power spectrum, with maximum energy
shifted upward for the helium-speech.

b. If a person were to hear himself speaking abnormally, it is assumed
that he would do something to make his speech sound more natural to
himself. We predicted that the talkers in our study would show greater
formant shifting in their helium-speech when auditory feedback was masked
than they would when they had normal auditory feedback. This would
indicate that when someone speaks while breathing helium-rich mixtures,
he alters his vocal production in an attempt to overcome the undesirable
changes caused by the helium.

c. Camp [3] found that there was an increase in intelligibility scores
for speech produced in air when a loud masking noise was introduced to
the talker’s ears. This led us to predict that a similar increase in
intelligibility would occur with helium-speech.

III. METHOD

Samples of speech were obtained from five men who had normal hearing,
according to typical audiometric testing. All recordings were made in an
anechoic chamber at normal atmospheric pressure. Prior to recording, each
man was trained to speak at a relatively constant vocal output while
observing a VU-meter (a VU-meter indicates the level of speech in Volume
Units).

For acoustic analyses, the following sentence was recorded by each man
under the four conditions which will be described later: “Tomorrow
evening at this hour the famous physician Doctor J. O. Lee will speak to
us about a matter of vital importance.”

The Modified Rhyme Test [4] (MRT) was used to test intelligibility. In
order to obtain a representative sampling of adult male speech, each man
read one-fifth of the words in one MRT list. This was repeated until one
list was available for each of the following four conditions:

Condition I: Talkers breathed air and had no restrictions imposed upon
their hearing. Auditory feedback was normal.

Condition II: Air again was breathed, but random noise was introduced
binaurally with earphones at a measured Sound Pressure Level (SPL) of
95 dB.

Condition III: Talkers breathed a mixture of 80 percent helium and 20
percent oxygen, and random noise at 95 dB SPL was used to mask auditory
feedback.

Condition IV: The helium-oxygen mixture was breathed and auditory
feedback was normal.

Figure 1 summarizes the recording situation. The masking noise was
produced by a General Radio Corporation Random Noise Generator and fed
through an Altec amplifier to a set of Telephonics TDH-39 earphones in MX
cushions. Since loud noise in the ears normally causes an increase in the
intensity of the voice, each talker visually monitored the level of his
speech on a Daven Co. VU-meter. Thus, each talker kept his vocal output
at a relatively constant overall level which approximated his comfortable
speaking level for air with normal feedback. A carrier phrase was used
with the test words. The microphone was an Altec 633A positioned six
inches directly in front of the lips at a 90° angle. Speech was recorded
on Channel One of an Ampex 300-2 Tape Recorder.

Figure 1. Summary diagram of recording situation.

A separate tape-loop was made for each “J. O. Lee” sentence as spoken by
each talker under each of the four conditions under study. These loops
permitted continuous playback of a sentence on an Ampex PR-10 Tape
Recorder. The recorder’s output was passed through each of 21
third-octave filter passband settings, with midpoints ranging from 125
Hertz (Hz) to 10 kilohertz (kHz), on a Bruel and Kjaer Audio Frequency
Spectrometer. The voltage of the resultant signal was measured with a
Flow Corporation Model-TBM averaging root-mean-square (RMS) voltmeter.
This provided the RMS voltage averaged over a period of 20 seconds for
the different passbands of speech produced under the four conditions.
A Kay Electric Company Missile Data-Reduction Spectrograph, or
“Missilizer,” was used to produce spectrograms of the recorded sentences.

Recordings of the Modified Rhyme Test were presented to 26 normal-hearing
Navy enlisted men in a group audio-testing room which contains 50 matched
monaural headsets. The order of presentation on the tapes was by talker.
The lists were presented monaurally at an average SPL of 70 dB in TDH-39
earphones embedded in supra-aural cushions. This level was obtained by
determining the average of speech peaks on a VU-meter for a list and
recording a 1000 Hz calibration tone at that same level. Then, the output
of the tone was set to measure 70 dB using a flat-plate coupler between
the headset and a sound level (SL) meter. In order to maximize
differences in intelligibility between conditions, random noise was mixed
with the speech signal to produce a speech-to-noise ratio of minus 5 dB.
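
The mixing step described above can be sketched in code. This is only a minimal illustration, assuming that speech and noise levels are compared as RMS values of digitized samples; the report set its levels from average speech peaks on a VU-meter and mixed the signals with analog equipment, so the function names, the sample signals, and the RMS convention below are all assumptions made for illustration.

```python
import math
import random


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples):
    """RMS level in decibels relative to an RMS value of 1.0."""
    return 20.0 * math.log10(rms(samples))


def scale_noise_for_snr(speech, noise, snr_db):
    """Return a copy of `noise` scaled so that the speech-to-noise
    ratio (speech level minus noise level) equals `snr_db`."""
    gain = rms(speech) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    return [gain * n for n in noise]


def mix(speech, noise):
    """Sample-by-sample sum of speech and noise."""
    return [s + n for s, n in zip(speech, noise)]


# Hypothetical stand-in signals: a 1000 Hz tone for "speech" and
# Gaussian random noise, both one second long at 8000 samples/second.
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 1000.0 * i / 8000.0) for i in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

scaled_noise = scale_noise_for_snr(speech, noise, snr_db=-5.0)
mixed = mix(speech, scaled_noise)
print(round(level_db(speech) - level_db(scaled_noise), 3))
```

A negative ratio means the noise is more intense than the speech, which is what makes the listening task hard enough to separate the four conditions.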

Listeners responded on answer sheets containing 6-word multiple-choice
blocks as described by House, et al. [4] Intelligibility was defined as
the percentage of correct responses.
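
The scoring rule above amounts to a simple percent-correct calculation per listener, followed by a mean across listeners. The sketch below assumes each answer sheet has been transcribed as a list of chosen words; the words, the function names, and the two example listeners are hypothetical and are not data from this study.

```python
def percent_correct(responses, answer_key):
    """Percentage of items on which the listener chose the word spoken."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have equal length")
    correct = sum(
        r.strip().lower() == k.strip().lower()
        for r, k in zip(responses, answer_key)
    )
    return 100.0 * correct / len(answer_key)


def mean_score(scores):
    """Mean intelligibility score across listeners."""
    return sum(scores) / len(scores)


# Hypothetical four-item key and two hypothetical listeners.
answer_key = ["bat", "pin", "sun", "kit"]
listener_a = ["bat", "pin", "fun", "kit"]
listener_b = ["bat", "pin", "sun", "kit"]

scores = [percent_correct(r, answer_key) for r in (listener_a, listener_b)]
print(scores, mean_score(scores))
```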

IV. RESULTS AND DISCUSSION

Two operations were performed to produce the average power spectrum.
First, the values found at each filter band were converted to voltage in
decibels relative to the unfiltered speech signal. This gave the power at
each third-octave band of speech re the overall level. Secondly, a linear
conversion was made to transform the third-octave power spectrum to the
more basic “Spectrum Level”, which is the level that would be measured if
our analyzer had an ideal response characteristic with a bandwidth of one
cycle [5].
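
The two operations can be illustrated with a short sketch. It assumes the usual bandwidth correction, in which the spectrum level equals the band level minus ten times the logarithm of the band's width in Hz, and an ideal third-octave band whose edges lie one-sixth octave on either side of the center frequency. The exact procedure of the cited reference is not reproduced in this report, so the function names and the example voltages are assumptions.

```python
import math


def band_level_re_overall_db(v_band_rms, v_overall_rms):
    """Band level in dB relative to the unfiltered (overall) signal,
    computed from RMS voltages."""
    return 20.0 * math.log10(v_band_rms / v_overall_rms)


def third_octave_bandwidth_hz(center_hz):
    """Width in Hz of an ideal third-octave band centered on center_hz."""
    upper = center_hz * 2.0 ** (1.0 / 6.0)
    lower = center_hz * 2.0 ** (-1.0 / 6.0)
    return upper - lower


def spectrum_level_db(band_level_db, center_hz):
    """Level per 1 Hz of bandwidth, assuming the energy is spread
    evenly across the band."""
    return band_level_db - 10.0 * math.log10(third_octave_bandwidth_hz(center_hz))


# Hypothetical reading: the 1000 Hz band measures 0.1 V RMS when the
# unfiltered speech measures 1.0 V RMS.
band_db = band_level_re_overall_db(0.1, 1.0)
print(round(band_db, 2), round(spectrum_level_db(band_db, 1000.0), 2))
```

Because the width of a third-octave band grows in proportion to its center frequency, the correction is larger for the higher bands; without it, the high-frequency bands would appear to hold more energy per cycle than they do.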

Mean values were obtained for the Spectrum Levels of the five speakers,
and a final long-term average power spectrum of speech was obtained for
each condition. Figure 2 shows these mean (across subjects) average
(across time) spectra for the four conditions of masking and breathing
mix. The ordinate is relative voltage level; the abscissa is frequency in
Hz. On first observation, there seems to be little evidence of formant
shifting, except for frequencies lower than about 1200 Hz. We felt that
some obscuring of the expected shift might be due to variability among
speakers. Therefore, individual power spectra for the four conditions
were examined for each talker. Figure 3, which is representative of the
data obtained for the other talkers, shows the long-term average spectra
from 300 to 1250 Hz for one talker. Note that the levels for air (solid
lines) reach a maximum near 500 Hz, while the levels for helium-speech
(dashed lines) reach their maxima closer to 800 Hz. However, because of
the general flatness of the spectra at these frequencies, the evidence of
an upward shift is slight, though positive. The implication is that
Spectrum Level is not a very sensitive measure for observing shifts in
frequency caused by breathing gas mixtures containing high concentrations
of helium.
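
As a rough check on the size of shift one might look for, the sketch below applies the factor of about one and one-half from prediction (a) to the 500 Hz maximum seen in air and finds the nearest third-octave band. The list of nominal center frequencies is the conventional series between 125 Hz and 10 kHz and is an assumption; the report does not list the exact filter settings used.

```python
import math

# Conventional nominal third-octave centers from 125 Hz to 10 kHz
# (assumed; not taken from the report).
NOMINAL_CENTERS_HZ = [
    125, 160, 200, 250, 315, 400, 500, 630, 800, 1000,
    1250, 1600, 2000, 2500, 3150, 4000, 5000, 6300, 8000, 10000,
]


def nearest_band(freq_hz):
    """Nominal center closest to freq_hz on a logarithmic scale."""
    return min(NOMINAL_CENTERS_HZ, key=lambda c: abs(math.log2(freq_hz / c)))


def shift_in_third_octaves(ratio):
    """Number of third-octave steps spanned by a frequency ratio."""
    return 3.0 * math.log2(ratio)


shifted_peak_hz = 1.5 * 500.0
print(shifted_peak_hz, nearest_band(shifted_peak_hz))
print(round(shift_in_third_octaves(1.5), 3))
```

Under these assumptions a 500 Hz maximum moves to 750 Hz, which falls in the band centered at 800 Hz, in line with the maxima noted for helium-speech. The whole shift spans fewer than two third-octave steps, which may help explain why it is hard to see in a spectrum that is nearly flat over this range.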


Figure 2. Means of subjects’ long-term average power spectra of speech in
air and in HeO2 with normal and with masked auditory feedback.

The second purpose of Figure 3 is to facilitate comparison between the
normal and the masked auditory feedback conditions. There is no
significant difference between masked and unmasked conditions for either
the speech in air (solid lines) or the helium-speech (dashed lines).

Spectrograms were made of the “J. O. Lee” sentences as produced by each
speaker under the four conditions. Results obtained for the same subject
whom we examined in Figure 3 are presented in Figure 4. Spectrograms in
the top line, I, are for air with normal feedback; the second line, II,
is for air with the 95 dB masking noise; line III is for helium-speech
with 95 dB noise present; and the bottom line, IV, is helium-speech with
no noise. In support of the analyses of Spectrum Level, an upward shift
in formant frequencies can be seen, but differences due to masking the
talker’s auditory feedback with loud noise are not evident. Spectrograms
for this subject are representative of the spectrograms obtained with the
other four speakers regarding formant position.

The mean intelligibility scores of the 26 listeners for the four
conditions studied are shown in Figure 5. In air, the introduction of
masking noise produced an increase in mean intelligibility of 10.3
percentage points. A similar increase of 9.3 percentage points occurred
when noise was added to the ears of speakers talking in helium. Both of
these improvements in mean intelligibility are statistically significant
beyond the 0.001 level of confidence, which supports our initial
prediction based on Camp’s investigation of speech produced in air in the
presence of high-level background noise. We conclude that interference
with normal auditory feedback causes a speaker to emphasize the
preciseness of his speech in a beneficial manner. Apparently, however,
this variation in vocal production was related to neither the acoustical
character of vowels as revealed by spectrograms, nor the long-term
average Spectrum Level of the voice as observed here. Tolhurst [6] has
reported that simple instructions to flyers about how to talk more
clearly significantly increased their speech intelligibility. Any one of
three instructions, “talk loudly”, “articulate precisely” and “talk
fast”, improved speaker intelligibility. We believe a similar improvement
would occur, either with simple instructions or with more formal voice
training, for speech produced in exotic gas mixtures under the stressful
conditions common to various undersea operations.

Figure 3. Long-term average power spectra of speech from 300 to 1250
Hertz in air and in HeO2 with normal and with masked auditory feedback.

Figure 4. Spectrograms of running speech in air and in HeO2 with normal
and with masked auditory feedback.

Figure 5. Mean intelligibility scores of speech in air and in HeO2 with
normal and with masked auditory feedback.

V. SUMMARY

The effects of auditory feedback upon production of helium-speech were
evaluated. The long-term average spectra of speech did not reveal
striking differences between speech in air and speech in the helium mix,
as had been expected. Spectrograms of vowels revealed upward shifts for
the helium-speech, as typically reported. However, neither the average
spectra nor the spectrograms of helium-speech and speech in air showed
significant differences between talking in noise and talking in quiet.
Mean intelligibility scores improved significantly, by about ten
percentage points, for both air and helium conditions when noise
interfered with a talker’s ability to hear what he was saying. It
appears, however, that the alterations a talker makes to improve his
speech intelligibility when speaking in the presence of loud noise are
not closely related to either the acoustic character of vowels as
revealed by spectrograms or the long-term spectra of the voice.